Abstract

In this paper, we consider a two-hop relay-assisted cognitive downlink OFDMA system (referred to as the secondary system) that dynamically accesses a spectrum licensed to a primary network, thereby improving the efficiency of spectrum usage. A cluster-based relay-assisted architecture is proposed for the secondary system, where relay stations are employed to minimize the interference to the users in the primary network and to achieve fairness for cell-edge users. Based on this architecture, an asymptotically optimal solution is derived for jointly controlling data rates, transmission power, and subchannel allocation to optimize the average weighted sum goodput, where proportional fair scheduling (PFS) is included as a special case. This solution supports decentralized implementation, requires small communication overhead, and is robust against imperfect channel state information at the transmitter (CSIT) and imperfect sensing measurements. The proposed solution achieves significant throughput gains and better user fairness compared with the existing designs. Finally, we derive a simple and asymptotically optimal scheduling solution, as well as the associated closed-form performance, under proportional fair scheduling for a large number of users. The system throughput is shown to be $\mathcal{O}\left(N(1-q_p)(1-q_p^N)\ln\ln K_c\right)$, where $K_c$ is the number of users in one cluster, $N$ is the number of subchannels and $q_p$ is the active probability of primary users.

EDICS Items: WIN-CLRD, WIN-CONT.

I. Introduction

Dynamic spectrum access [1] is a new paradigm for meeting the rapidly growing demand for broadband access under spectrum scarcity in next-generation wireless communication systems. This motivates the study in this paper on designing a two-hop relay-assisted cognitive OFDMA system which dynamically shares spectrum access with a primary system (PU) by exploiting its idle periods.

A. Related Work and Motivation 

The issues of power control for dynamic spectrum access in ad hoc networks are addressed in [2], [3], [4]. Cellular systems using cognitive radio for dynamically accessing the television spectrum are being standardized by the IEEE 802.22 working group. In [5], a joint beamforming and power control algorithm is proposed for a cognitive cellular system to mitigate interference to the primary network. A key obstacle to implementing dynamic spectrum access in cellular systems is that direct transmission from base stations to cell-edge users requires large power and thus causes strong interference to the users in the primary networks. As a result, the cell-edge users have very small access opportunity due to the primary user activities, and this fairness issue cannot be solved simply by fair scheduling at the base station, because the cell-edge users are limited by the channel access opportunity rather than the scheduling opportunity. Hence, a relay-assisted cellular system is an effective solution for alleviating this fairness issue because it reduces the transmission power required to reach the mobiles on the cell edge. However, there are still a few critical issues associated with the design and operation of relay-assisted CR systems, as summarized below.

• Optimal Decentralized Power, Rate and Subchannel Allocation Algorithm: Extensive research has been carried out on resource allocation in point-to-point relay-assisted communication systems. Power and subchannel allocations for relay-assisted OFDMA systems are studied in [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16]. However, these existing works consider centralized solutions (e.g. at the BS), in which the resource (power, rate and subchannel) allocation of the BS and the RSs is computed in a centralized manner at the BS based on the global system state knowledge^1. Hence, the conventional centralized approach is very difficult to implement in practice due to huge signaling overhead and computational complexity. Moreover, various simplifying assumptions were made in these works to simplify the resource allocation problem in two-hop OFDMA systems at the cost of performance loss. For example, one typical constraint is that the relay can only receive the data for one MS in each subchannel, and this data should be forwarded completely and exclusively to the target MS in one subchannel in the next phase [10], [12]. This may cause significant performance loss when the BS-RS link is much better than the RS-MS link. Therefore, the challenge is to have a decentralized solution without performance loss compared with the centralized solutions.

^1 Global system state refers to the aggregate of the channel state information (CSI) of all the BS-RS links, the RS-MS links, the BS-MS links as well as the sensing measurements of the BS and all the relays.

• Fairness Consideration in Two-Hop Systems: Conventional relay-assisted cellular systems perform resource allocation to maximize the sum throughput [6], [7]. Yet fairness is an important requirement, and a general solution for fair scheduling in relay-assisted (two-hop) CR systems has not been fully addressed. When fairness is considered in a relay-assisted system, neither the optimization objective nor the flow balance constraint for the relays is convex. Therefore, the conventional approaches for sum-throughput optimization in the previous works cannot be applied, and how to solve such a resource allocation problem with fairness consideration in relay-assisted systems is an important challenge to overcome.

• Dynamic Spectrum Access with Imperfect CSIT and Sensing Measurement: Conventional resource optimization problems in relay-assisted systems [6], [7] do not consider dynamic spectrum sharing. However, the presence of PU activity and dynamic spectrum sharing changes the fundamental dynamics of the resource allocation problem. For efficient spectrum sharing, it is critical for the CR systems to exploit the temporal and spatial burstiness of the PU activity gaps without interrupting the PU transmissions. This problem is even more challenging when we have to take into account imperfect channel state information and sensing measurements, in which case interference to the PU cannot be completely avoided.

^2 By decentralized, we mean that the resource control actions at the BS and the M RSs are computed locally at the BS and each of the M RSs respectively, based on the local system state at each node. There is also explicit message passing between the BS and the M RS nodes. Local system state at the BS refers to the CSI of the BS-mobile and BS-relay links and the sensing measurement of the BS; local system state at the m-th RS refers to the CSI of the links from the m-th RS to all its MSs and the sensing measurement of the m-th RS. Thus, the global system state is the aggregation of the local system states at the BS and all the M relays.

B. Contributions 

The key contributions of our work are summarized as follows. We consider a cluster-based two-hop RS-assisted cognitive OFDMA system, as shown in Figures 1 and 2. We are interested in the associated resource control problem, which is a difficult non-convex problem. Moreover, the traditional centralized optimization approach requires significant communication overhead between the base station and the relay stations, and has exponentially many control variables w.r.t. the number of independent subchannels. To tackle these difficulties, we decompose the resource control problem into a base station master problem and relay station subproblems, where the number of control variables is significantly reduced (it grows linearly w.r.t. the number of frequency bands). We derive a low-complexity, low-overhead and decentralized algorithm for controlling power, rate, and subchannel allocation, which asymptotically maximizes the weighted sum goodput (average b/s/Hz successfully received by the MS) under the primary-user interference constraint. We also include the well-known proportional fair scheduling (PFS) as a special case in our formulation. The solution accounts for multiuser diversity, user fairness, imperfect channel state information at the transmitter (CSIT) and imperfect spectrum sensing. As shown by simulations, the proposed resource allocation algorithm significantly improves the fairness for cell-edge users. Finally, a simple and asymptotically optimal scheduling policy, as well as the closed-form performance for PFS, is derived to obtain design insights. For instance, we show that the throughput of the proposed two-hop relay-assisted cognitive OFDMA system under PFS is $\mathcal{O}\left(N(1-q_p)(1-q_p^N)\ln\ln K_c\right)$, where $K_c$ is the number of users in one cluster, $N$ is the number of independent subchannels and $q_p$ is the active probability of primary users on one subchannel.

The remainder of this paper is organized as follows. The system model is described in Section II. In Section III, the problem of optimal power, rate, and subchannel allocation is formulated; the solutions are presented in Section IV. Asymptotic throughput analysis is given in Section V. Section VI contains simulation results, followed by concluding remarks in Section VII.

II. System Model

A. Architecture and Protocol 

As illustrated in Figure 1, the secondary user (SU) system is a cluster-based relay-assisted cognitive OFDMA downlink system consisting of one base station (BS) transmitting to $K$ mobile users (MS), where communications are assisted by $M$ relay stations (RS) as elaborated shortly. The cell is divided into $M + 1$ clusters as shown in Figure 1. The central cluster (served by the BS) is indexed as the 0-th cluster, whose users communicate directly with the base station over relatively short distances. Each of the remaining $M$ clusters is served by a half-duplex RS^3. Specifically, each RS forwards data packets from the base station to users in its cluster using the decode-and-forward (DaF) strategy. The number of users in the m-th cluster is denoted as $K_m$. For notational convenience, we assume that the first $M$ users in the 0-th cluster are the $M$ RSs, and the remaining $K_0 - M$ users in the 0-th cluster are the MSs of the 0-th cluster ($K_0 + K_1 + \dots + K_M = K + M$).

The above secondary user (SU) system is assumed to opportunistically access a spectrum licensed to another network, whose users are referred to as the primary users (PU) and have the highest priority in using the spectrum. Primary users are distributed over the service area of the SU system. To avoid interrupting the communication of primary users, no transmitter of the SU system (neither the BS nor any RS) is allowed to transmit if there is an active PU in its coverage area.

The protocol for relay transmission is described as follows. The channels are assumed to be frequency selective and divided into $N$ independent subchannels using orthogonal frequency division multiplexing (OFDM) modulation [17]. Downlink transmission is divided into frames, each with two phases (as illustrated in Figure 2). In phase one, the base station delivers packets to the MSs of the 0-th cluster and all the RSs; in phase two, each RS forwards data packets to the MSs in the corresponding cluster. To avoid interfering with MSs in other clusters, we make the following assumption:

Assumption 1: The base station does not deliver packets in phase two. In order to control the inter-cluster interference between two adjacent relay clusters, the transmitted signals at the adjacent RSs are spread by different orthogonal spreading sequences in the frequency domain, as illustrated in Figure 1.

^3 In this architecture, the system design still has the flexibility that each MS can be served by multiple RSs and the BS: each MS can be treated as multiple virtual MSs, each served by one RS.

B. Channel Model

The channel realization is assumed to be quasi-static over one frame but independent and identically distributed (i.i.d.) across different frames. Channel gains are characterized by the long-term path loss, shadowing and the short-term fading. The symbol received at the k-th user of the m-th cluster in the n-th subchannel, denoted as $Y_{m,n,k}$, can be written as

$$Y_{m,n,k} = \sqrt{p_{m,n,k}\, l_{m,k}}\, H_{m,n,k} X_{m,n,k} + Z_{m,n,k},$$

where $X_{m,n,k}$ is the transmitted symbol, $p_{m,n,k}$ is the transmission power, $l_{m,k}$ is the long-term channel attenuation due to path loss and shadowing, $H_{m,n,k} \sim \mathcal{CN}(0, 1)$ models short-term fading, and $Z_{m,n,k} \sim \mathcal{CN}(0, 1)$ represents the additive white Gaussian noise. Note that $H_{m,n,k}$ represents the channel between the k-th user and the base station if $m = 0$, or the m-th relay station if $m > 0$.
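The received-signal model above can be sketched numerically. The snippet below is only an illustration: the power, the attenuation and the QPSK symbol are hypothetical values chosen for the example, and `complex_gaussian` and `received_symbol` are helper names introduced here, not names from the paper.

```python
import math
import random


def complex_gaussian(rng, variance=1.0):
    """Draw one sample from CN(0, variance): the real and imaginary parts
    are independent N(0, variance / 2)."""
    std = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, std), rng.gauss(0.0, std))


def received_symbol(p, l, h, x, z):
    """Y = sqrt(p * l) * H * X + Z for one (m, n, k) link."""
    return math.sqrt(p * l) * h * x + z


rng = random.Random(0)
p, l = 2.0, 0.5                      # hypothetical power and long-term attenuation
x = complex(1, 1) / math.sqrt(2)     # unit-energy QPSK symbol (assumption)
h = complex_gaussian(rng)            # short-term fading H ~ CN(0, 1)
z = complex_gaussian(rng)            # noise Z ~ CN(0, 1)
y = received_symbol(p, l, h, x, z)
```

Because both the fading and the noise have unit variance, the product `p * l` plays the role of the average received SNR in this sketch.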

The BS and RSs adapt the data rates, power and subchannel allocation for the downlink transmission based on the CSI at the transmitter (CSIT). We consider a time division duplex (TDD) system where the CSIT can be acquired through channel reciprocity [18]. Due to CSI estimation noise as well as duplexing delay, the CSIT obtained will not be accurate, and the CSIT error model (based on MMSE prediction) is given by

$$\hat{H}_{m,n,k} = H_{m,n,k} + \Delta H_{m,n,k}, \quad \forall m, n, k \qquad (1)$$

where $H_{m,n,k}$ represents the actual CSI, $\Delta H_{m,n,k}$ represents the CSIT error, which is modelled as complex Gaussian with mean 0 and variance $\sigma_e^2$ ($\Delta H_{m,n,k} \sim \mathcal{CN}(0, \sigma_e^2)$), and $\mathbb{E}[\Delta H_{m,n,k} \hat{H}_{m,n,k}^{*}] = 0$ (meaning that the estimation error $\Delta H_{m,n,k}$ is uncorrelated with the CSIT $\hat{H}_{m,n,k}$). For convenience, the CSIT is grouped by cluster into the sets $\hat{\mathbf{H}}_m = \bigcup_{n,k}\{\hat{H}_{m,n,k}\}$ for $0 \le m \le M$, which are referred to as the local CSIT at the m-th cluster. The set $\hat{\mathbf{H}} = \bigcup_{m=0}^{M} \hat{\mathbf{H}}_m$ is called the global CSIT.

C. Dynamic Spectrum Access and Fairness Issues 

In each cluster, each secondary user senses the spectrum and searches for subchannels unused by primary users, which, for instance, may be wireless microphones or other Part 74 devices [19]. The spectrum sensing results consist of binary indicators specifying the availability of subchannels. These sensing results are referred to as raw sensing information (RSI) in this paper. Let $\hat{S}_{m,n,k} \in \{0, 1\}$ denote the sensed state at the k-th user on the n-th subchannel in the m-th cluster, where $\hat{S}_{m,n,k} = 1$ and $\hat{S}_{m,n,k} = 0$ correspond to the states "available" and "unavailable", respectively. The RSI $\hat{\mathbf{S}}_m = \{\hat{S}_{m,n,k} \mid \forall n, k\}$ is communicated by users to their corresponding servers (BS/RS) for enabling resource allocation. Moreover, we also define the aggregation of RSI from all clusters as $\hat{\mathbf{S}} = \{\hat{\mathbf{S}}_m \mid \forall m\}$. Let $S_{m,n}$ be the actual primary-user state on the n-th subchannel in the m-th cluster, with $S_{m,n} = 1$ denoting that the subchannel is actually available and $S_{m,n} = 0$ denoting otherwise, let $\mathbf{S}_m = \{S_{m,n} \mid \forall n\}$ be the actual PU activity of all the subchannels in the m-th cluster, and let $\mathbf{S} = \{\mathbf{S}_m \mid \forall m\}$ be the aggregation of the actual PU activity of all clusters, which is quasi-static over a number of frames^4. Moreover, define $q_p = \Pr(S_{m,n} = 1)$ as the probability that one subchannel is available, which is assumed to be identical for all $m$ and $n$. In practice, we cannot have perfect sensing at the mobile, and there exist nonzero probabilities for the events of false alarm ($q_f = \Pr(\hat{S}_{m,n,k} = 0 \mid S_{m,n} = 1)$) and mis-detection ($q_m = \Pr(\hat{S}_{m,n,k} = 1 \mid S_{m,n} = 0)$) [20]. Moreover, $q_d = 1 - q_m$ represents the probability of detection.
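As a hedged illustration of how imperfect 1-bit sensing reports could be combined, the sketch below applies Bayes' rule to obtain the probability that a subchannel is actually available given the reports of several users. It assumes that sensing errors are independent across users given the true PU state, which the text does not state explicitly; `posterior_available` and `p_avail` (the prior probability that the subchannel is available) are names introduced here.

```python
def posterior_available(sensed, p_avail, q_f, q_m):
    """Return Pr(S = 1 | sensed bits) by Bayes' rule.

    sensed  : list of 0/1 sensing results (1 = "available")
    p_avail : prior probability that the subchannel is available
    q_f     : false-alarm probability   Pr(sensed = 0 | S = 1)
    q_m     : mis-detection probability Pr(sensed = 1 | S = 0)
    Assumes sensing errors are independent across users given S.
    """
    like_avail = 1.0   # Pr(sensed | S = 1)
    like_busy = 1.0    # Pr(sensed | S = 0)
    for s in sensed:
        like_avail *= (1.0 - q_f) if s == 1 else q_f
        like_busy *= q_m if s == 1 else (1.0 - q_m)
    num = p_avail * like_avail
    den = num + (1.0 - p_avail) * like_busy
    return num / den if den > 0 else 0.0
```

With no reports the function returns the prior, and each additional "available" report raises the posterior as long as `1 - q_f > q_m`.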

Due to the imperfect sensing measurement, it is not possible to eliminate the interference from the SU to the PU systems. To protect communication in the PU networks, we require

$$I_{m,n} = \Big(\sum_{k=1}^{K_m} p_{m,n,k}\Big)\, \tau_{m,n} \big(1 - \mathbb{E}[S_{m,n} \mid \hat{\mathbf{S}}_{m,n}]\big) \le I, \quad \forall m, n, \qquad (2)$$

where $I_{m,n}$ is the conditional average interference level (conditioned on the sensing measurement) from the SU (at the m-th cluster and the n-th subchannel) to the active PU, $\hat{\mathbf{S}}_{m,n} = \{\hat{S}_{m,n,k} \mid k \in [1, K_m]\}$, $p_{m,n,k}$ is the transmit power of the m-th RS (or the BS) to its k-th MS in the n-th subchannel, and $\tau_{m,n}$ is the path loss between the SU transmitter (at the m-th cluster and the n-th subchannel) and the active PU. Thus, each SU transmitter should guarantee that the average interference to the active PU in its cluster area is not larger than a tolerance threshold $I$.
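The interference constraint can be checked with a few lines of code. The sketch below is illustrative only: `beta` stands for the conditional availability probability (the expectation of the actual PU state given the sensing reports), and the function names and numeric values are assumptions made for the example.

```python
def interference_level(powers, tau, beta):
    """Conditional average interference on one subchannel:
    (sum of per-user powers) * path loss * Pr(PU active | sensing)."""
    return sum(powers) * tau * (1.0 - beta)


def max_total_power(i_threshold, tau, beta):
    """Largest total power on the subchannel that keeps the conditional
    average interference at or below the threshold."""
    if beta >= 1.0 or tau <= 0.0:
        return float("inf")   # no interference is possible in this case
    return i_threshold / (tau * (1.0 - beta))


# Hypothetical numbers: two users, path loss 0.5, 80% chance the PU is idle.
level = interference_level([1.0, 2.0], tau=0.5, beta=0.8)
ok = level <= 0.35
```

The more confident the sensing result (the closer `beta` is to 1), the more power the transmitter may place on the subchannel under the same threshold.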

Remarks (Fairness Issue with Cognitive OFDMA Systems without RS): Consider a simple scenario where we have one PU in each of the $M$ RS clusters as well as in the BS cluster, as shown in Figure 1. As a result, there are $M + 1$ PUs in the system. Let $q_p$ be the probability that the PU in a cluster becomes active in one subchannel. If there is no RS in the SU system in Figure 1, the access opportunity of a cell-edge user (a user in a cluster $m > 0$) in one subchannel is $(1 - q_p)^{M+1}$, which is the probability that all the $M + 1$ PUs in the BS's coverage area are idle. Hence, the cell-edge users could hardly access the spectrum even for moderate PU activity, leading to a critical fairness issue.
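To get a feel for how quickly this access opportunity vanishes, the short sketch below evaluates $(1 - q_p)^{M+1}$ for a few activity levels. The choice of six relay clusters is a hypothetical example, and the function name is introduced here for illustration.

```python
def access_opportunity_without_relay(q_p, num_relay_clusters):
    """Probability that all M + 1 PUs in the BS coverage area are idle
    on one subchannel, i.e. (1 - q_p) ** (M + 1)."""
    return (1.0 - q_p) ** (num_relay_clusters + 1)


# Hypothetical cell with M = 6 relay clusters.
for q_p in (0.1, 0.3, 0.5):
    print(q_p, access_opportunity_without_relay(q_p, 6))
```

The exponent grows with the number of clusters, so the opportunity decays geometrically in $M$ for any fixed PU activity.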

^4 In practice, the PU activity changes over a longer time scale compared with the CSI.

III. Joint Control of Rate, Power and Subchannel Allocation: Problem Formulation

In this section, we shall formulate the rate, power and subchannel allocation design as an optimization 
problem. We first formally define the optimization variables (control policies) as well as the optimization 
objectives below. 

A. Definitions of Control Policies 

Consider transmitting to the k-th user in the 0-th cluster (the BS's cluster) over the n-th subchannel. The transmission power, rate and percentage of subchannels the base station allocates to the user are denoted as $p_{0,n,k}(\hat{\mathbf{H}}, \hat{\mathbf{S}})$, $r_{0,n,k}(\hat{\mathbf{H}}, \hat{\mathbf{S}})$ and $\alpha_{0,n,k}(\hat{\mathbf{H}}, \hat{\mathbf{S}})$ respectively, which are adapted to the imperfect CSIT $\hat{\mathbf{H}}$ and RSI $\hat{\mathbf{S}}$. The corresponding policies for controlling transmit power ($\mathcal{P}_0$), subchannel allocation ($\mathcal{A}_0$) and transmit data rate ($\mathcal{R}_0$) are defined as the function sets $\mathcal{P}_0 := \{p_{0,n,k}(\hat{\mathbf{H}}, \hat{\mathbf{S}}) \mid \forall n, k\}$, $\mathcal{A}_0 := \{\alpha_{0,n,k}(\hat{\mathbf{H}}, \hat{\mathbf{S}}) \mid \forall n, k\}$, and $\mathcal{R}_0 := \{r_{0,n,k}(\hat{\mathbf{H}}, \hat{\mathbf{S}}) \mid \forall n, k\}$. These policies must satisfy a set of constraints. Specifically, assuming the total transmission power at the base station is fixed at $P_0$,

Power constraint (BS): $\sum_{n=1}^{N} \sum_{k=1}^{K_0} p_{0,n,k}(\hat{\mathbf{H}}, \hat{\mathbf{S}}) \le P_0.$ (3)

By definition, the percentages of subchannels allocated to different users/relay stations satisfy

Subchannel allocation constraint (BS): $\sum_{k=1}^{K_0} \alpha_{0,n,k}(\hat{\mathbf{H}}, \hat{\mathbf{S}}) \le 1, \; \forall n \in [1, N].$ (4)

Furthermore, the data rates are adjusted under a constraint on the per-hop packet error probability^5 $P_{out}$, namely that for a given per-hop PER constraint $0 < \epsilon < 1$,

Per-hop outage constraint (BS): $P_{out}(r_{0,n,k}, \hat{\mathbf{H}}) = \Pr(r_{0,n,k} > \Re_{0,n,k} \mid \hat{\mathbf{H}}) = \epsilon, \; \forall n \in [1, N], k \in [1, K_0],$ (5)

where $\Re_{0,n,k}$ is the maximum achievable data rate from the base station to the k-th user in the n-th subchannel.

Each packet transmitted from the base station to a relay station is designed to contain information bits for users to be served by this RS in the cluster. Let $d_{m,n,k}$ be the fraction of the k-th user's information bits in a packet transmitted over the n-th subchannel and received at the m-th relay station. It follows from the definition that

Packet partition constraint (BS): $\sum_{k=1}^{K_m} d_{m,n,k} \le 1, \; \forall m > 0, n.$ (6)

The base station is assumed to control $\{d_{m,n,k}\}$ based on the CSIT and RSI. The corresponding control policy for the m-th RS is defined as $\mathcal{D}_m := \{d_{m,n,k}(\hat{\mathbf{H}}, \hat{\mathbf{S}}) \mid \forall n, k\}$. Moreover, we also define the system packet partition policy as $\mathcal{D} = \bigcup_{m=1}^{M} \mathcal{D}_m$.

^5 We assume sufficiently strong coding, such as LDPC, is used so that the PER is dominated by the channel outage (transmit data rate exceeding the instantaneous mutual information). This is reasonable, as it has been shown [21] that LDPC with a reasonable block size (e.g. 8 kbyte) could achieve the Shannon limit to within 0.05 dB.

The policies used by a relay station depend on the packet receiving status of the phase one transmission. Let $t_{n,m} \in \{0, 1\}$ denote the indicator of the decoding state of the m-th relay station on the n-th subchannel, where $t_{n,m} = 1$ means the corresponding packet is decoded successfully and $t_{n,m} = 0$ means otherwise. Moreover, define the set $\mathbf{T}_m = \{t_{n,m} \mid \forall n \in [1, N]\}$. Adding the newly defined sets as input, the policies for controlling power, rate, and subchannel allocation at relay stations are defined similarly to those for the base station as $\mathcal{P}_m := \{p_{m,n,k}(\hat{\mathbf{H}}, \hat{\mathbf{S}}, \mathbf{T}_m) \mid \forall n, k\}$, $\mathcal{A}_m := \{\alpha_{m,n,k}(\hat{\mathbf{H}}, \hat{\mathbf{S}}, \mathbf{T}_m) \mid \forall n, k\}$, and $\mathcal{R}_m := \{r_{m,n,k}(\hat{\mathbf{H}}, \hat{\mathbf{S}}, \mathbf{T}_m) \mid \forall n, k\}$. These policies must satisfy the following constraints:

Power constraint (relay): $\sum_{n=1}^{N} \sum_{k=1}^{K_m} p_{m,n,k} \le P_m, \; \forall m \in [1, M]$ (7)

Subchannel allocation constraint (relay): $\sum_{k=1}^{K_m} \alpha_{m,n,k} \le 1, \; \forall m \in [1, M], n \in [1, N]$ (8)

Per-hop outage constraint (relay): $P_{out}(\{r_{m,n,k}\}, \hat{\mathbf{H}}, \mathbf{T}_m) = \Pr(r_{m,n,k} > \Re_{m,n,k} \mid \hat{\mathbf{H}}) = \epsilon$ (9)

Flow balance constraint: $\sum_{n=1}^{N} r_{m,n,k} \le \sum_{n=1}^{N} d_{m,n,k}\, t_{n,m}\, r_{0,n,m}, \; \forall m \in [1, M], k \in [1, K_m],$ (10)

where $\Re_{m,n,k}$ is the maximum achievable data rate from the m-th relay station to the k-th user in the n-th subchannel. The last constraint (10) holds because the total information bits transmitted by each RS cannot exceed the information bits received from the BS.
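The flow balance constraint can be illustrated with a small check for a single relay cluster. This is a sketch under assumed data layouts: `d[n][k]`, `t[n]`, `r0[n]` and `r[n][k]` are list-based stand-ins for the packet partition, the phase-one decoding indicators, the BS-to-RS rates and the relay's scheduled rates, and the function names are introduced here.

```python
def relay_budget(d, t, r0):
    """Per-user bits available at one relay after phase one:
    budget[k] = sum_n d[n][k] * t[n] * r0[n]."""
    num_users = len(d[0])
    return [sum(d[n][k] * t[n] * r0[n] for n in range(len(t)))
            for k in range(num_users)]


def flow_balanced(r, d, t, r0, tol=1e-9):
    """True if, for every user, the total rate the relay schedules in
    phase two does not exceed what it received for that user."""
    budget = relay_budget(d, t, r0)
    for k, available in enumerate(budget):
        scheduled = sum(r[n][k] for n in range(len(r)))
        if scheduled > available + tol:
            return False
    return True


# Two subchannels, two users; the packet on subchannel 1 was not decoded.
d = [[0.5, 0.5], [1.0, 0.0]]
t = [1, 0]
r0 = [4.0, 2.0]
feasible = flow_balanced([[1.0, 1.0], [1.0, 0.5]], d, t, r0)
```

In the example only subchannel 0 contributes to the budget, so each user may be forwarded at most 2.0 units in phase two.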

B. Average Weighted Goodput and Fairness 

The average weighted goodput is defined and used in the sequel as the metric for optimizing the control policies discussed in the preceding section. When the PU is not active at the m-th cluster and the n-th subchannel ($S_{m,n} = 1$), the instantaneous mutual information between the m-th transmitter and the k-th receiver in the n-th subchannel is given by:

$$C_{m,n,k} = g_m \alpha_{m,n,k} \log_2\Big(1 + \frac{p_{m,n,k}\, l_{m,k}\, |H_{m,n,k}|^2}{\alpha_{m,n,k}}\Big), \quad S_{m,n} = 1,$$

where $g_m \in \{0.25, 0.5\}$ ($g_0 = 0.5$ and $g_m = 0.25$ $\forall m > 0$) is a constant indicating the spectrum efficiency. Due to the half-duplex constraint at the base station, $g_m$ is equal to 0.5 for $m = 0$ (the base station's cluster). Moreover, due to the half-duplex constraint and the orthogonal spreading at the RSs, $g_m = 0.25$ for $m \ge 1$ (the relay stations' clusters). On the other hand, we make the following assumption on the interference from the PU to the SU:

Assumption 2: We assume the power of an active PU is large, so that the SU transmission in one cluster will fail if there is any active PU in that cluster using the same subchannel.

Hence, when $S_{m,n} = 0$ (PU active), there is large interference from the PU and the instantaneous mutual information can be regarded as $C_{m,n,k} = 0$. As a result, the instantaneous mutual information can be written as:

$$C_{m,n,k} = \begin{cases} g_m \alpha_{m,n,k} \log_2\Big(1 + \dfrac{p_{m,n,k}\, l_{m,k}\, |H_{m,n,k}|^2}{\alpha_{m,n,k}}\Big), & \text{if } S_{m,n} = 1, \\ 0, & \text{if } S_{m,n} = 0. \end{cases}$$

Due to the imperfect CSIT knowledge, there is uncertainty in the instantaneous mutual information $C_{m,n,k}$ at the transmitters and hence, there will be potential packet errors due to channel outage if the scheduled data rate exceeds $C_{m,n,k}$. This packet error is systematic and cannot be alleviated by using strong error correction coding. As a result, we shall consider goodput (b/s successfully delivered to the mobiles) as our performance measure. The instantaneous goodput over the $(m, n, k)$-th subchannel is defined as

$$U_{m,n,k} = r_{m,n,k}\, \mathbf{I}(r_{m,n,k} \le C_{m,n,k}) = r_{m,n,k}\, S_{m,n}\, \mathbf{I}(r_{m,n,k} \le \Re_{m,n,k}),$$

where $\mathbf{I}(A)$ is the indicator function with value 1 when the event $A$ is true and 0 otherwise.
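The definitions of mutual information and goodput can be sketched directly in code. The example below is illustrative: the function names and the numeric values are assumptions, and `h_abs2` denotes the squared magnitude of the short-term fading coefficient.

```python
import math


def mutual_information(g, alpha, p, l, h_abs2, available):
    """Instantaneous mutual information on one (m, n, k) link:
    g * alpha * log2(1 + p * l * |H|^2 / alpha) if the subchannel is
    free of PU activity, and 0 otherwise."""
    if not available or alpha <= 0.0:
        return 0.0
    return g * alpha * math.log2(1.0 + p * l * h_abs2 / alpha)


def goodput(r, c):
    """Instantaneous goodput: the scheduled rate r is delivered only if
    it does not exceed the mutual information c (no outage)."""
    return r if r <= c else 0.0


# BS cluster (g = 0.5), whole subchannel given to one user.
c = mutual_information(0.5, 1.0, 3.0, 1.0, 1.0, available=True)
delivered = goodput(0.9, c)   # rate below capacity, so it is delivered
lost = goodput(1.1, c)        # rate above capacity, so the packet is lost
```

The all-or-nothing behaviour of `goodput` is what makes the rate choice under imperfect CSIT a trade-off between a higher scheduled rate and a higher outage probability.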

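Let $\{w_{m,k}\}$ be a set of goodput weights for different users (the weight for the k-th user in the m-th cluster is $w_{m,k}$), whose values are set according to the users' QoS priorities. The average weighted goodput is given below:

$$G(\mathcal{A}, \mathcal{P}, \mathcal{D}) := \mathbb{E}_{\hat{\mathbf{S}}, \hat{\mathbf{H}}}\Big[\sum_{n=1}^{N} \sum_{k=M+1}^{K_0} w_{0,k} U_{0,n,k} + \sum_{m=1}^{M} \sum_{n=1}^{N} \sum_{k=1}^{K_m} w_{m,k} U_{m,n,k}\Big]$$

$$= \mathbb{E}_{\hat{\mathbf{S}}, \hat{\mathbf{H}}}\Big[\underbrace{\mathbb{E}_{\mathbf{S}, \mathbf{H}}\Big[\sum_{n=1}^{N} \sum_{k=M+1}^{K_0} w_{0,k} U_{0,n,k} \,\Big|\, \hat{\mathbf{S}}, \hat{\mathbf{H}}\Big] + \sum_{m=1}^{M} \mathbb{E}_{\mathbf{S}, \mathbf{H}}\Big[\sum_{n=1}^{N} \sum_{k=1}^{K_m} w_{m,k} U_{m,n,k} \,\Big|\, \hat{\mathbf{S}}, \hat{\mathbf{H}}\Big]}_{G(\mathbf{A}, \mathbf{P}, \mathbf{D} \mid \hat{\mathbf{S}}, \hat{\mathbf{H}})}\Big]$$

where $G(\mathbf{A}, \mathbf{P}, \mathbf{D} \mid \hat{\mathbf{S}}, \hat{\mathbf{H}})$ defined above is referred to as the conditional average system goodput (conditioned on $\hat{\mathbf{S}}, \hat{\mathbf{H}}$), $\mathcal{A} = \{\mathcal{A}_m \mid \forall m \in [0, M]\}$, $\mathcal{P} = \{\mathcal{P}_m \mid \forall m \in [0, M]\}$ and $\mathcal{D} = \{\mathcal{D}_m \mid \forall m \in [1, M]\}$ are the subchannel allocation policy, power allocation policy and packet partition policy of the system respectively, and $\mathbf{A}$, $\mathbf{P}$ and $\mathbf{D}$ denote the subchannel allocation action, power allocation action and packet partition action of the system respectively for a given global CSIT $\hat{\mathbf{H}}$ and global RSI $\hat{\mathbf{S}}$.

June 25, 2010 DRAFT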

Remarks (Incorporating Fairness in the Weighted Goodput): Note that the optimization objective in the above equation embraces fairness in the resource allocation. For instance, users with higher priorities could be allocated a larger weight $w_{m,k}$. Furthermore, proportional fair scheduling (PFS), which is a commonly used fairness attribute, is also embraced by setting $w_{m,k}(t) = \frac{1}{R_{m,k}(t)}$, where $w_{m,k}(t)$ is the weight of the k-th user at the m-th cluster and t-th frame and $R_{m,k}(t)$ is the measured average throughput of this user. $R_{m,k}(t)$ is updated on each frame according to $R_{m,k}(t) = \big(1 - \frac{1}{t_c}\big) R_{m,k}(t - 1) + \frac{1}{t_c} \sum_{n=1}^{N} r_{m,n,k}(t)$, where $t_c$ is the duration of one frame and $r_{m,n,k}(t)$ is the scheduled data rate of the user in the t-th frame.
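The PFS weight update can be sketched as an exponential moving average. This is a minimal illustration that assumes the update takes the common form with an averaging constant, written `t_c` here; the function names and the small floor used to avoid division by zero are choices made for the example, not part of the paper.

```python
def update_average_throughput(avg_prev, scheduled_rates, t_c):
    """One-frame update of a user's measured average throughput:
    (1 - 1/t_c) * previous average + (1/t_c) * rate scheduled this frame
    (summed over all subchannels)."""
    return (1.0 - 1.0 / t_c) * avg_prev + (1.0 / t_c) * sum(scheduled_rates)


def pfs_weight(avg_throughput, floor=1e-9):
    """PFS weight: the reciprocal of the measured average throughput.
    The floor avoids division by zero for users never served so far."""
    return 1.0 / max(avg_throughput, floor)


# A user that is not scheduled for three frames sees its weight grow.
avg = 2.0
for _ in range(3):
    avg = update_average_throughput(avg, [0.0, 0.0], t_c=4)
weight = pfs_weight(avg)
```

Users that have been starved accumulate a larger weight, which is how the weighted goodput objective steers resources back toward them.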
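Notice that

$$\mathbb{E}_{\mathbf{S}, \mathbf{H}}[U_{0,n,k} \mid \hat{\mathbf{S}}, \hat{\mathbf{H}}] = \beta_{0,n}\, r_{0,n,k} \big(1 - \Pr[r_{0,n,k} > \Re_{0,n,k} \mid \hat{\mathbf{H}}]\big) = \beta_{0,n}\, r_{0,n,k} (1 - \epsilon)$$

and

$$\mathbb{E}_{\mathbf{S}, \mathbf{H}}\Big[\sum_{n=1}^{N} \sum_{k=1}^{K_m} w_{m,k} U_{m,n,k} \,\Big|\, \hat{\mathbf{S}}, \hat{\mathbf{H}}\Big] = \mathbb{E}_{\mathbf{T}_m, \mathbf{S}_m, \mathbf{H}_m}\Big[\sum_{n=1}^{N} \sum_{k=1}^{K_m} w_{m,k} U_{m,n,k} \,\Big|\, \hat{\mathbf{S}}, \hat{\mathbf{H}}\Big]$$

$$= \mathbb{E}_{\mathbf{T}_m}\Big[\sum_{n=1}^{N} \sum_{k=1}^{K_m} \mathbb{E}_{\mathbf{S}_m, \mathbf{H}_m}[w_{m,k} U_{m,n,k} \mid \mathbf{T}_m] \,\Big|\, \hat{\mathbf{S}}, \hat{\mathbf{H}}\Big]$$

$$= \mathbb{E}_{\mathbf{T}_m}\Big[\sum_{n=1}^{N} \sum_{k=1}^{K_m} w_{m,k}\, \beta_{m,n}\, r_{m,n,k} \big(1 - \Pr[r_{m,n,k} > \Re_{m,n,k} \mid \hat{\mathbf{H}}]\big)\Big]$$

$$= \mathbb{E}_{\mathbf{T}_m}\Big[\sum_{n=1}^{N} \sum_{k=1}^{K_m} w_{m,k}\, \beta_{m,n}\, r_{m,n,k} (1 - \epsilon)\Big],$$

where $\beta_{m,n} = \mathbb{E}[S_{m,n} \mid \hat{\mathbf{S}}]$ is the probability that the n-th subchannel in the m-th cluster is available given the sensing feedback from the mobiles, and $\Pr[r_{m,n,k} > \Re_{m,n,k} \mid \hat{\mathbf{H}}] = \epsilon$ ($\forall m, n, k$) is the conditional packet error probability of a one-hop link for given $\hat{\mathbf{H}}$. Hence, $G$ can be written as

$$G(\mathbf{A}, \mathbf{P}, \mathbf{D} \mid \hat{\mathbf{S}}, \hat{\mathbf{H}}) = \underbrace{\sum_{n=1}^{N} \sum_{k=M+1}^{K_0} w_{0,k}\, \beta_{0,n}\, r_{0,n,k} (1 - \epsilon)}_{G_0(\mathbf{A}_0, \mathbf{P}_0 \mid \hat{\mathbf{S}}, \hat{\mathbf{H}})} + \sum_{m=1}^{M} \mathbb{E}_{\mathbf{T}_m}\Big[\underbrace{\sum_{n=1}^{N} \sum_{k=1}^{K_m} w_{m,k}\, \beta_{m,n}\, r_{m,n,k} (1 - \epsilon)}_{G_m(\mathbf{A}_0, \mathbf{P}_0, \mathbf{A}_m, \mathbf{D}_m, \mathbf{P}_m \mid \hat{\mathbf{S}}_m, \hat{\mathbf{H}}_m, \mathbf{T}_m)}\Big]$$

or

$$G(\mathbf{A}, \mathbf{P}, \mathbf{D} \mid \hat{\mathbf{S}}, \hat{\mathbf{H}}) = G_0(\mathbf{A}_0, \mathbf{P}_0 \mid \hat{\mathbf{S}}, \hat{\mathbf{H}}) + \sum_{m=1}^{M} \mathbb{E}_{\mathbf{T}_m} G_m(\mathbf{A}_0, \mathbf{P}_0, \mathbf{A}_m, \mathbf{D}_m, \mathbf{P}_m \mid \hat{\mathbf{S}}_m, \hat{\mathbf{H}}_m, \mathbf{T}_m),$$

where $\mathbf{A}_m = \{\alpha_{m,n,k} \mid \forall n, k\}$, $\mathbf{P}_m = \{p_{m,n,k} \mid \forall n, k\}$, $\mathbf{D}_m = \{d_{m,n,k} \mid \forall n, k\}$ and $\mathbf{D} = \{\mathbf{D}_m \mid \forall m\}$, $\mathbf{A} = \{\mathbf{A}_m \mid \forall m\}$, $\mathbf{P} = \{\mathbf{P}_m \mid \forall m\}$.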



C. Problem Formulation 

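Since a policy consists of a set of actions for each realization of CSIT and RSI, finding the optimal policy is equivalent to the following problem.

Problem 1: For each given CSIT $\hat{\mathbf{H}}$ and RSI $\hat{\mathbf{S}}$ realization, we have:

$$\{\mathbf{A}^*(\hat{\mathbf{H}}, \hat{\mathbf{S}}), \mathbf{P}^*(\hat{\mathbf{H}}, \hat{\mathbf{S}}), \mathbf{D}^*(\hat{\mathbf{H}}, \hat{\mathbf{S}})\} = \arg\max_{\mathbf{A}_0, \mathbf{P}_0, \mathbf{D}} \Big\{ G_0(\mathbf{A}_0, \mathbf{P}_0 \mid \hat{\mathbf{S}}, \hat{\mathbf{H}}) + \sum_{m=1}^{M} \mathbb{E}_{\mathbf{T}_m}\Big[\underbrace{\max_{\mathbf{A}_m, \mathbf{P}_m} G_m(\mathbf{A}_0, \mathbf{P}_0, \mathbf{A}_m, \mathbf{D}_m, \mathbf{P}_m \mid \hat{\mathbf{S}}_m, \hat{\mathbf{H}}_m, \mathbf{T}_m)}_{\text{Local optimization on } G_m}\Big] \Big\}$$

s.t. the constraints in (2)-(10).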



Remarks (Comparison with Traditional Resource Allocation Problem in OFDMA Systems): Since neither the objective function nor the constraint (10) is convex, the traditional optimization approaches in [6], [7] cannot be applied to this problem, as the duality gap is not zero. Moreover, due to the potential packet error at the BS-RS link, the traditional centralized controller needs to solve for $O(M 2^N)$ control variables for all possible $\mathbf{T}_m$ realizations in Problem 1 (the RSs' control actions are functions of $\mathbf{T}_m$). Thus, the brute-force solution for Problem 1 involves unacceptable computational complexity and huge communication overhead between the BS and the RSs. In this paper, we shall show how to decompose this non-convex optimization problem into optimization problems at the BS and the RSs. By appropriate design of the backward recursion and the online strategy, the system only needs to solve for $O(MN)$ control variables. Furthermore, the algorithm can be implemented in a distributed manner in the system, and the communication overhead between the BS and RSs is very small.
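The gap between the two orders of growth is easy to underestimate. The sketch below compares them for a hypothetical system size; the functions return only the order-of-growth expressions with constants ignored, and their names are introduced here for illustration.

```python
def centralized_variable_order(num_relays, num_subchannels):
    """Order of growth M * 2^N: one set of relay actions for every
    possible phase-one decoding pattern."""
    return num_relays * 2 ** num_subchannels


def decentralized_variable_order(num_relays, num_subchannels):
    """Order of growth M * N obtained after the decomposition."""
    return num_relays * num_subchannels


# Hypothetical system: M = 6 relays, N = 10 independent subchannels.
brute_force = centralized_variable_order(6, 10)
decomposed = decentralized_variable_order(6, 10)
```

Each additional subchannel doubles the brute-force count but adds only $M$ variables to the decomposed problem.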

In Problem 1, the local optimization on $G_m$ with respect to $\mathbf{A}_m$ and $\mathbf{P}_m$ ($\max_{\mathbf{A}_m, \mathbf{P}_m} G_m(\cdot)$) is subject to the constraints (2) ($m > 0$) and (7)-(10). As a result, for a given Phase-I receiving status $\{r_{0,n,m} t_{n,m}\}$ and packet partitioning $\{d_{m,n,k}\}$, these local optimizations $\max_{\mathbf{A}_m, \mathbf{P}_m} G_m(\cdot)$ can be done locally at the m-th RS for $m \in \{1, \dots, M\}$. Therefore, using the standard argument of primal decomposition [22], solving Problem 1 is equivalent to solving the following two subproblems:
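Subproblem 1 (Optimization at m-th RS):

$$G_m^{*}(\{r_{0,n,m} t_{n,m}\}, \{d_{m,n,k}\} \mid \hat{\mathbf{S}}_m, \hat{\mathbf{H}}_m) = \max_{\mathbf{A}_m, \mathbf{P}_m} \sum_{n=1}^{N} \sum_{k=1}^{K_m} (1 - \epsilon)\, w_{m,k}\, \beta_{m,n}\, g_m \alpha_{m,n,k} \log_2\Big(1 + \frac{p_{m,n,k}\, \varphi_{m,n,k}}{\alpha_{m,n,k}}\Big)$$

s.t. the constraints in (2) ($m > 0$), (7)-(10),
where $\varphi_{m,n,k} = \frac{1}{2}\, l_{m,k}\, \sigma_e^2\, F^{-1}_{|\hat{H}_{m,n,k}|^2/\sigma_e^2;\,2}(\epsilon)$, which is obtained from the outage probability constraint, and $F^{-1}_{|\hat{H}_{m,n,k}|^2/\sigma_e^2;\,2}(\cdot)$ denotes the inverse cdf of a non-central chi-square random variable with 2 degrees of freedom and non-centrality parameter $|\hat{H}_{m,n,k}|^2/\sigma_e^2$.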
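Subproblem 2 (Optimization at the BS):

$$\max_{\mathbf{A}_0, \mathbf{P}_0, \mathbf{D}} \; G_0(\mathbf{A}_0, \mathbf{P}_0 \mid \hat{\mathbf{S}}, \hat{\mathbf{H}}) + \sum_{m=1}^{M} \mathbb{E}_{\mathbf{T}_m} G_m^{*}(\{r_{0,n,m} t_{n,m}\}, \{d_{m,n,k}\} \mid \hat{\mathbf{S}}_m, \hat{\mathbf{H}}_m)$$

$$= \max_{\mathbf{A}_0, \mathbf{P}_0} \; \sum_{n=1}^{N} \sum_{k=M+1}^{K_0} (1 - \epsilon)\, w_{0,k}\, \beta_{0,n}\, g_0 \alpha_{0,n,k} \log_2\Big(1 + \frac{p_{0,n,k}\, \varphi_{0,n,k}}{\alpha_{0,n,k}}\Big) + \sum_{m=1}^{M} \mathbb{E}_{\mathbf{T}_m} G_m^{**}\Big(\sum_{n=1}^{N} r_{0,n,m} t_{n,m} \,\Big|\, \hat{\mathbf{S}}_m, \hat{\mathbf{H}}_m\Big) \qquad (11)$$

s.t. the constraints in (2) ($m = 0$), and (3)-(6),
where

$$G_m^{**}\Big(\sum_{n=1}^{N} r_{0,n,m} t_{n,m} \,\Big|\, \hat{\mathbf{S}}_m, \hat{\mathbf{H}}_m\Big) = \max_{\{d_{m,n,k} \mid \forall k\}} G_m^{*}(\{r_{0,n,m} t_{n,m}\}, \{d_{m,n,k}\} \mid \hat{\mathbf{S}}_m, \hat{\mathbf{H}}_m).$$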

The divide-and-conquer procedure to solve Problem 1 is given below:

• Backward Recursion: At the beginning of each frame, after channel estimation and sensing, each RS calculates the function $G_m^{**}(r \mid \hat{\mathbf{S}}_m, \hat{\mathbf{H}}_m)$ and feeds it back to the BS.

• Online Strategy: In phase one, the BS solves Subproblem 2 and delivers packets accordingly. In phase two, each RS (say the m-th RS) solves its Subproblem 1 according to the packet receiving status in phase one, $\{r_{0,n,m} t_{n,m} \mid \forall n = 1, 2, \dots, N\}$, and delivers packets accordingly.

IV. Joint Control of Rate, Power and Subchannel Allocation: Solutions 

In this section, we shall derive a low-complexity solution for the general weighted goodput optimization. The solution supports decentralized implementation, which significantly reduces computational complexity and signaling load. Furthermore, the solution is asymptotically optimal when the number of users is sufficiently large and the BS-RS links are sufficiently good. We shall also derive the solution for PFS as a special case.

A. Asymptotically Optimal Algorithm 

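Solution of Subproblem 1: Subproblem 1 can be solved by using the duality approach [23]. Specifically, the Lagrangian is given as

$$L_m = G_m - \sum_{n=1}^{N} \lambda_n \Big(\sum_{k=1}^{K_m} \alpha_{m,n,k} - 1\Big) - \eta \Big(\sum_{n=1}^{N} \sum_{k=1}^{K_m} p_{m,n,k} - P_m\Big) - \sum_{n=1}^{N} \mu_n \Big(\sum_{k=1}^{K_m} (1 - \beta_{m,n})\, \tau_{m,n}\, p_{m,n,k} - I\Big) - \sum_{k=1}^{K_m} \nu_k \Big(\sum_{n=1}^{N} g_m \alpha_{m,n,k} \log_2\Big(1 + \frac{p_{m,n,k}\, \varphi_{m,n,k}}{\alpha_{m,n,k}}\Big) - R_{m,k}\Big),$$

where $R_{m,k} = \sum_{n=1}^{N} t_{n,m}\, d_{m,n,k}\, r_{0,n,m}$ is constant in this subproblem. Hence, the dual problem is:
Subproblem 3 (Dual Problem of Subproblem 1):

$$\min_{\boldsymbol{\lambda}, \eta, \boldsymbol{\mu}, \boldsymbol{\nu}} \; \max_{\mathbf{A}_m, \mathbf{P}_m} L_m(\boldsymbol{\lambda}, \eta, \boldsymbol{\mu}, \boldsymbol{\nu})$$

s.t. $\boldsymbol{\lambda}, \eta, \boldsymbol{\mu}, \boldsymbol{\nu} \succeq 0$,
where $\boldsymbol{\lambda} \succeq 0$ means each element of the vector $\boldsymbol{\lambda}$ is nonnegative.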
The algorithm to solve the above dual problem is presented in Appendix A. Note that Subproblem 1 is a non-convex optimization problem because the optimization constraint is non-convex. Nevertheless, since the problem satisfies the "time sharing" property introduced in [24], the duality gap of the above problem is zero, and hence solving the above dual problem leads to the optimal solution of Subproblem 1.
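To illustrate the flavor of the duality approach, the sketch below solves a deliberately simplified instance: one user per subchannel (all time-sharing factors fixed to 1) and only the total power constraint, so a single multiplier `eta` remains and can be found by bisection. This is not the algorithm of Appendix A; the function name, the simplification and the numbers are assumptions made for the example.

```python
import math


def waterfill(weights, gains, total_power, iters=100):
    """Maximize sum_n w_n * log2(1 + p_n * g_n) subject to
    sum_n p_n <= total_power and p_n >= 0 (weights and gains positive).

    For a fixed multiplier eta > 0 the inner maximization of the
    Lagrangian has the closed form
        p_n = max(0, w_n / (eta * ln 2) - 1 / g_n),
    and eta is found by bisection so that the power budget is met.
    """
    def powers(eta):
        return [max(0.0, w / (eta * math.log(2)) - 1.0 / g)
                for w, g in zip(weights, gains)]

    lo = 1e-12
    # At this price every subchannel is switched off (zero power used).
    hi = max(w * g for w, g in zip(weights, gains)) / math.log(2)
    for _ in range(iters):
        eta = 0.5 * (lo + hi)
        if sum(powers(eta)) > total_power:
            lo = eta   # budget exceeded: raise the price of power
        else:
            hi = eta   # budget met: try a lower price
    return powers(hi)   # hi always satisfies the power budget


# Hypothetical example: the second subchannel is too weak to be used.
p = waterfill([1.0, 1.0], [10.0, 0.01], total_power=1.0)
```

The full subproblem adds multipliers for the subchannel sharing, interference and flow balance constraints, but each of them enters the per-subchannel closed form in a similar price-like manner.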

Solution of Subproblem 2: The expectation over the binary vector $\mathbf{T}_m$ in Subproblem 2 must be taken over exponentially many (w.r.t. the number of subchannels $N$) possible realizations, which incurs unacceptable computational complexity. In the following lemma, we show that the expectation over the binary vector $\mathbf{T}_m$ can be asymptotically decoupled into each subchannel, so that the computational complexity becomes linear.

Lemma 1 (Asymptotically Equivalent Objective): When the channels between the BS and the RSs are
sufficiently good, each relay is scheduled on at most one subchannel. Hence, the objective of
Subproblem 2 can be written as

$$
\max_{A_0, P_0} \; G_0(A_0, P_0 \,|\, S_0, H_0) + \sum_{m=1}^{M} \sum_{n=1}^{N} (1 - \epsilon) \beta_{0,n} G_m^*(r_{0,n,m} \,|\, S_m, H_m).
$$

Proof: Please refer to Appendix B. □

Since Subproblem 2 is solved at the BS, each RS should report the expression of $G_m^*(r)$ to the BS.
Feeding back an exact expression of $G_m^*(r)$ involves a large feedback overhead. In the following
lemma, we show that the feedback overhead can be significantly reduced when the user density $\rho$
is sufficiently large:

Lemma 2: When the user density $\rho$ is sufficiently large, $G_m^*(r)$ is a convex piecewise linear function.
Proof: Please refer to Appendix C. □

The construction of the function $G_m^*(r)$ is presented in Appendix C as well. Moreover, an example of
the function $G_m^*(r)$ is illustrated in Figure 3. With the conclusion of Lemma 2, Subproblem 2 is a
convex optimization problem and can also be solved by the duality approach (similar to Subproblem 1),
which is presented in Appendix A. As a result, the overall decentralized resource allocation algorithm
for the relay-assisted CR system is summarized below:

Algorithm 1 (Decentralized Asymptotically Optimal Control Algorithm): The overall decentralized
control algorithm includes the following steps:

• Step 1 (Cluster-Based Spectrum Sensing): For m ∈ {0, ..., M}, the mobiles in cluster m deliver the
1-bit RSI to the cluster controller (BS or RS).

• Step 2 (Backward Recursion): The m-th RS feeds back the function $G_m^*(\cdot)$ to the BS.

• Step 3 (Online Strategy, Phase One): From the local CSI ($H_0$), the local RSI ($S_0$) and the
functions $\{G_m^*(\cdot)\}$, the BS determines the power, rate and subchannel allocation of the
mobiles in cluster 0 as well as the RSs using the iterative algorithm for Subproblem 2 in Appendix A.

• Step 4 (Online Strategy, Phase Two): If the m-th RS decodes the information from the BS
successfully, it determines the power, rate and subchannel allocation of the MSs in its cluster based
on the local CSI ($H_m$) and RSI ($S_m$) using the solution of Subproblem 1 in Appendix A.

Remarks: The solution is decentralized in the sense that the computational load is shared between
the BS and the RSs. Furthermore, only local CSI is needed at the m-th RS and at the BS, which
substantially reduces the signaling load required to deliver the global CSI in the conventional
centralized approach. While the m-th RS needs to feed back $G_m^*(\cdot)$ to the BS, the required
signaling load is very small because $G_m^*(\cdot)$ is a piecewise-linear function (as illustrated in
Figure 3) and it can be characterized by O(ML) parameters in the worst case (L is the number of QoS levels).

B. PFS scheduling for Two-Hop RS-Assisted Cognitive OFDMA System 

The system objective function of PFS is given by $\sum_{m,n,k} \frac{r_{m,n,k}}{\overline{R}_{m,k}}$,
where $\overline{R}_{m,k}$ is the average throughput of the k-th user in the m-th cluster. As a result,
PFS is a special case of the weighted goodput objective considered in the paper. Yet, a brute-force
application of the solution in the previous section to PFS will incur a large signaling overhead from
the RS to the BS, because the $G_m^*(\cdot)$ of PFS involves a very large number of parameters (and
hence induces a huge signaling overhead for the m-th RS to feed back $G_m^*(\cdot)$ to the BS). In the
following, we obtain a simple characterization of $G_m^*(\cdot)$ (which is asymptotically optimal)
under PFS.

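Lemma 3: Suppose the links between the base station and the relays are sufficiently good. If $K_m$ is
sufficiently large, $G_m^*(r)$ in Subproblem 2 can be simplified as follows:

$$
G_m^*(r) =
\begin{cases}
\sum_{n=1}^{N} \frac{(1-\epsilon)\beta_{m,n} w_{m,A_{m,n}} \, r}{4 R_m} \log_2\big(1 + p_{m,n} l_{m,n,A_{m,n}} \varphi_{m,n,A_{m,n}}\big), & r < R_m \\
\sum_{n=1}^{N} \frac{(1-\epsilon)\beta_{m,n} w_{m,A_{m,n}}}{4} \log_2\big(1 + p_{m,n} l_{m,n,A_{m,n}} \varphi_{m,n,A_{m,n}}\big), & \text{otherwise}
\end{cases}
$$

where

$$
R_m = \sum_{n=1}^{N} \log_2\big(1 + p_{m,n} l_{m,n,A_{m,n}} \varphi_{m,n,A_{m,n}}\big), \qquad p_{m,n} = \frac{\beta_{m,n} P_0}{\sum_{n=1}^{N} \beta_{m,n}},
$$

and

$$
A_{m,n} = \arg\max_{k} \; w_{m,k} \log_2\big(1 + p_{m,n} l_{m,n,k} \varphi_{m,n,k}\big).
$$
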
Proof: Please refer to Appendix D. □ 
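Since $G_m^*(r)$ can be parameterized by
$\sum_{n=1}^{N} \frac{(1-\epsilon)\beta_{m,n} w_{m,A_{m,n}}}{4} \log_2\big(1 + p_{m,n} l_{m,n,A_{m,n}} \varphi_{m,n,A_{m,n}}\big)$
and $R_m$, the feedback overhead to deliver $G_m^*(r)$ from the m-th RS to the BS is very small and
does not scale with $K_m$.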
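As a rough illustration of why the PFS feedback is cheap, the sketch below evaluates a two-parameter function of the kind described in Lemma 3: linear in r below a threshold and constant above it. The names `g_star`, `r_m` and `g_max` are hypothetical and only stand for the two numbers an RS would report; this is a minimal sketch under that assumption, not the authors' implementation.

```python
def g_star(r, r_m, g_max):
    """Two-parameter piecewise-linear function (illustrative assumption).

    r     : number of information bits delivered to the relay (clipped at 0)
    r_m   : threshold above which the relay has enough bits (must be > 0)
    g_max : weighted goodput achieved when the relay is fully backlogged
    """
    if r_m <= 0:
        raise ValueError("r_m must be positive")
    # Linear interpolation between (0, 0) and (r_m, g_max), flat afterwards.
    fraction = min(max(r, 0.0) / r_m, 1.0)
    return g_max * fraction


if __name__ == "__main__":
    for bits in (0.0, 2.0, 4.0, 8.0):
        print(bits, g_star(bits, r_m=4.0, g_max=3.0))
```

With this form, the BS needs only the pair (`r_m`, `g_max`) per relay, regardless of how many users the relay serves.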

V. Asymptotic Goodput of Two-Hop RS-Assisted Cognitive OFDMA Systems under PFS 

In this section, we analyze the asymptotic performance of the scheduling algorithm derived in the 
preceding section. Specifically, the system throughput is derived for a sufficiently large number of users 
in each cluster. To obtain insights on the performance gains, we impose a set of simplifying assumptions. 
We assume each cluster contains $K_c$ MSs. Furthermore, we assume a line-of-sight link (with high-gain
antennas) between the RSs and the BS, and hence the throughput is limited by the second hop. Finally,
users are not closer than $\gamma$ to the RS, where $\gamma$ is a certain fixed distance. The following
theorem summarizes the asymptotic system goodput of the relay-assisted cognitive OFDMA system under PFS.

Theorem 1: Suppose there are M RS clusters and N independent subchannels in the system. Furthermore,
consider a simple scenario where there is one PU in each of the M RS clusters and the BS cluster,
as shown in Figure 1. Let $q_p$ be the probability that a PU becomes active in a subchannel. For a
sufficiently large number of MSs per cluster $K_c$ and sufficiently strong BS-RS links in the above
system, the average throughput of the k-th user in the m-th cluster (m ≥ 1) achieved under
proportional fair scheduling is given by

$$
\overline{T}_{m,k} = \frac{N (1 - q_p)(1 - q_p^N)}{K_c} \int_0^{+\infty} \frac{1}{4} \log_2\Big(1 + \frac{P_0}{N} l_{m,k} x\Big) \, dF_{\max,K_c}(x) \qquad (13)
$$

$$
\approx \frac{N (1 - q_p)(1 - q_p^N)}{4 K_c} \log_2\Big(1 + \frac{P_0}{N} l_{m,k} \ln K_c\Big) \quad \text{when } K_c \to +\infty, \qquad (14)
$$

where $F_{\max,K_c}(x)$ is the CDF of $\max\{|H_{m,n,k}|^2 \,|\, \forall k\}$. The equivalent PFS
scheduling rule at the RS is given by

$$
A_{m,n} = \arg\max_{k} \{|H_{m,n,k}|^2\} \qquad (15)
$$

where $A_{m,n}$ is the selected user of subchannel n in the m-th cluster.

Proof: Please refer to Appendix E. □ 
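The scheduling rule in the theorem and the logarithmic growth of the best user's channel gain can be checked numerically. The sketch below assumes i.i.d. CN(0, 1) small-scale fading, so each power gain is a unit-mean exponential and the mean of the maximum over K users equals the harmonic number 1 + 1/2 + ... + 1/K, which grows like ln K. The function names `select_users` and `mean_max_gain` are illustrative and do not come from the paper.

```python
import numpy as np


def select_users(h):
    """Pick, for each subchannel, the user with the largest channel power gain.

    h : complex array of shape (K, N), fading of K users on N subchannels.
    Returns an integer array of length N with the selected user indices.
    """
    return np.argmax(np.abs(h) ** 2, axis=0)


def mean_max_gain(k_users, trials=20000, seed=0):
    """Monte Carlo estimate of E[max_k |H_k|^2] for i.i.d. CN(0, 1) fading."""
    rng = np.random.default_rng(seed)
    shape = (trials, k_users)
    h = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
    return float(np.mean(np.max(np.abs(h) ** 2, axis=1)))


if __name__ == "__main__":
    for k in (10, 50, 200):
        harmonic = sum(1.0 / j for j in range(1, k + 1))
        print(k, mean_max_gain(k), harmonic, np.log(k))
```

The printed columns let one compare the simulated mean, the exact harmonic number and ln K for a few cluster sizes.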

Using a similar analysis as in Appendix E, it can be shown that the average goodput (under PFS) of a
mobile in a cognitive OFDMA system without RS is given by:

$$
\overline{T}^{b}_{m,k} \approx \frac{N (1 - q_p)^{M+1}}{M K_c} \log_2\Big(1 + \frac{P_0}{N} l^{b}_{m,k} \ln(M K_c)\Big) \quad \text{when } K_c \to +\infty, \qquad (16)
$$

where $\ln(M K_c)$ is the asymptotic value of the maximum channel gain, whose CDF $F_{\max,MK_c}(x)$ is
that of $\max\{|H_{m,n,k}|^2 \,|\, \forall k, m\}$, $l^{b}_{m,k}$ is the long-term path loss and
shadowing from the user to the base station, the coefficient $(1 - q_p)^{M+1}$ before the logarithm is
the probability that one subchannel is available, the N in the numerator of the coefficient appears
because there are N parallel independent subchannels, and the $M K_c$ in the denominator appears because
in each subchannel the access probability of each MS is $1/(M K_c)$. Compared with the results in
Theorem 1, it can be concluded that

• The system goodput of the regular cognitive OFDMA system without relay stations is
$O(N (1 - q_p)^{M+1} \ln\ln M K_c)$, which is very sensitive to the PU activity due to the factor
$(1 - q_p)^{M+1}$. For moderate $q_p$, the spectrum access opportunity of the cell-edge users is very small.

• The spectrum access opportunity of the cell-edge users can be improved by employing relays. Active
primary users in one relay cluster do not affect the packet transmission in other relay clusters,
as illustrated by the factor $(1 - q_p)(1 - q_p^N)$ in equation (14). Moreover, the receive SNR at the
mobile users is significantly increased by employing relays ($l_{m,k} \gg l^{b}_{m,k}$). As a result,
the relay-assisted CR system can achieve much higher system throughput than the baseline system
without relays under PFS.
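To make the sensitivity to PU activity concrete, the following sketch evaluates the two availability factors discussed above for given values of $q_p$, M and N. It is only an arithmetic illustration; the function names are hypothetical, and the factors are taken directly from the expressions in the text.

```python
def availability_without_relay(q_p, num_rs_clusters):
    """Probability that a subchannel is clean in the whole cell: (1 - q_p)^(M + 1)."""
    return (1.0 - q_p) ** (num_rs_clusters + 1)


def availability_with_relay(q_p, num_subchannels):
    """Factor (1 - q_p) * (1 - q_p^N) seen by a user in a relay cluster."""
    return (1.0 - q_p) * (1.0 - q_p ** num_subchannels)


if __name__ == "__main__":
    # M = 6 relay clusters and N = 4 subchannels, as in the simulation setup.
    for q_p in (0.1, 0.3, 0.5):
        print(q_p,
              availability_without_relay(q_p, 6),
              availability_with_relay(q_p, 4))
```

For $q_p = 0.3$, M = 6 and N = 4, the factor without relays is $0.7^7 \approx 0.082$, whereas the relay-assisted factor is $0.7 \times (1 - 0.3^4) \approx 0.694$.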

Since $q_p$ is the probability that a PU is active in one subchannel of one cluster, the probability that
one subchannel is clean (no active PU) in the whole cell is $(1 - q_p)^{M+1}$. The BS can transmit packets
to the cell-edge users in one subchannel only when this subchannel is clean in the whole cell area (i.e.
all PUs in the coverage area are idle). Thus, the probability that one subchannel is available is
$(1 - q_p)^{M+1}$.
VI. Simulation Results and Discussions

In this section, we compare the performance of the proposed relay-assisted cognitive OFDMA
system with several baseline systems. Baseline 0 refers to a naive design of a cognitive OFDMA system
(without RS) where the power, rate and subchannel allocation are designed assuming perfect CSIT.
Baseline 1 refers to the Separate and Sequential Allocation (SSA) in a relay-assisted cognitive OFDMA
system, which is a semi-distributed scheme proposed for relay-assisted OFDMA systems in [16]. A similar
approach also appears in [10], [12]. Baseline 2 and 3 refer to a similar cognitive OFDMA system (without
RS). Moreover, in baseline 1 and 2, the PU activity in the RS clusters is the same as that in the BS
cluster, i.e. $q_{pm} = q_{p0}$. In baseline 3, the PU activity in the RS clusters is much lower than
that in the BS cluster, i.e. $q_{pm} = 1 - (1 - q_{p0})^{1/M}$. In baseline 1, 2 and 3, the control
policies are designed for imperfect CSIT. The overall cell radius of the system is 5000 m, in which
Cluster 0 has a radius of 2000 m and RS 1-6 are evenly distributed on a circle with radius 3000 m, as
illustrated in Figure 1. MSs are randomly distributed in the cell with $K_0 = 10$ MSs in Cluster 0 and
$K_m = 5$ in Cluster m (m = 1, ..., 6). The path loss model of BS-MS and RS-MS is
128.1 + 37.6 log10(R) dB, and the path loss model of BS-RS is 128.1 + 28.8 log10(R) dB (R in km). The
lognormal shadowing standard deviation is 8 dB. There are 64 subcarriers with 4 independent
subchannels. The small-scale fading follows $\mathcal{CN}(0, 1)$. We set up our simulation scenarios
according to the practical settings in [26]. The average interference constraint to the PU is 0 dB.
Each point in the figures is obtained by averaging over 2000 independent fading realizations.
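The two path loss models quoted above are easy to evaluate directly. The sketch below is a small helper under the assumption that "access" denotes the BS-MS and RS-MS links and "backhaul" denotes the BS-RS link; these labels and the function name `path_loss_db` are illustrative and are not taken from the paper.

```python
import math


def path_loss_db(distance_km, link="access"):
    """Path loss in dB for a link of the given length in km.

    link = "access"   : BS-MS or RS-MS, 128.1 + 37.6 * log10(R)
    link = "backhaul" : BS-RS,          128.1 + 28.8 * log10(R)
    """
    if distance_km <= 0:
        raise ValueError("distance must be positive")
    slopes = {"access": 37.6, "backhaul": 28.8}
    if link not in slopes:
        raise ValueError("link must be 'access' or 'backhaul'")
    return 128.1 + slopes[link] * math.log10(distance_km)


if __name__ == "__main__":
    # A cell-edge MS at 5 km served directly by the BS, versus the same MS
    # served over 2 km by an RS placed 3 km from the BS.
    print(path_loss_db(5.0, "access"))
    print(path_loss_db(3.0, "backhaul"), path_loss_db(2.0, "access"))
```

Comparing the printed values gives a feel for the path-loss gain that a relay placed on the 3000 m circle can offer to a cell-edge user.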

System Performance versus PU Activities: Figure 4 illustrates the PFS objective $\sum_k \log R_k$
(average sum-log-rate successfully received by each MS) and the access probability of MSs in Cluster m
(m = 1, ..., M) versus the PU activity $q_p$ at receive SNR = 10 dB and $\sigma_e^2 = 0.01$. Both
$\sum_k \log R_k$ and the access probability decrease as the PU activity increases. It can be observed
that our proposed scheme provides much greater access probability as well as fairness/throughput
performance for the MSs at the cell edge compared with baseline 2 and 3 over a wide range of PU
activities. This performance gain is contributed by the conventional RS path-loss gain as well as the
increase in the access opportunity for MSs at the edge.

5000 m is one of the typical cell radii for LTE and LTE-Advanced systems (e.g. rural area) [25].

$\sum_k \log R_k$ is the PFS optimization objective, which is a good indication of the tradeoff between
throughput and fairness.
Access probability is the probability that an MS on the cell edge is allowed to receive data on at least
one subchannel in a scheduling slot.

Figure 5 illustrates the histogram of the average goodput of MSs (average data rate successfully
received by the MSs) at various distances from the BS at receive SNR = 10 dB and $\sigma_e^2 = 0.01$.
It can be observed that baseline 2 can deliver large system goodput only for those MSs close to the BS.
It has very low access probability and average goodput for far-away mobiles, causing severe fairness
issues. However, there are significant gains in the system goodput of far-away MSs in the proposed
system and baseline 1, illustrating both the throughput and fairness advantages of the system with RSs.
Furthermore, the proposed scheme outperforms the SSA scheme in baseline 1. Figure 6 illustrates the
corresponding CDF of the average goodput of MSs at various distances. The low average goodput regime
(x-axis) demonstrates the performance of cell-edge users: the smaller the probability (y-axis) in the
low average goodput regime, the larger the average goodput of the cell-edge users. It can be observed
that our proposed scheme brings better performance (larger average goodput for the cell-edge users and
larger sum average goodput of all users) compared with the baselines.

System Performance versus Receive SNR: Figure 7 illustrates the PFS objective $\sum_k \log R_k$
(average sum-log-rate successfully received by each MS) and the access probability of MSs in Cluster m
(m = 1, ..., M) versus receive SNR. It can be observed that our proposed design has a significant gain
over the baseline 1, 2 and 3 systems. The gain is more prominent in the low SNR region because the
conventional RS reduces the path loss greatly and utilizes the limited power more efficiently.

System Performance versus the Number of MSs: Figure 8 illustrates $\sum_k \log R_k$ (average
sum-log-rate successfully received by each MS) versus the number of MSs in a cell at receive
SNR = 10 dB and $\sigma_e^2 = 0.01$. The ratio between $K_0$ and $K_m$ is kept constant. While the
proposed scheme has the best performance compared with baseline 2 and 3, the performance of all three
schemes increases with K, which demonstrates the multi-user diversity gain in the system.

System Performance versus CSIT quality: Figure 9 illustrates the average system goodput (average
data rate successfully received by the MSs) versus CSIT quality. The performance gain of the proposed
scheme over baseline 1 illustrates the robustness of the proposed scheme w.r.t. CSIT errors. On the
other hand, the comparison between baseline 2 and baseline 0 illustrates that it is very important to
take CSIT errors into account in the design. Baseline 0 has very poor performance because there are many
packet errors due to channel outage.
VII. Conclusion

In this paper, we have proposed the design of a downlink two-hop relay-assisted cognitive OFDMA
system, which has a cluster-based architecture and dynamically shares the spectrum of PU systems.
Optimal decentralized algorithms have been derived for joint rate and power control and subchannel
allocation at the RS and the BS, respectively. These algorithms maximize the weighted system goodput,
where proportional fair scheduling is included as a special case. The solution processes local system
state measurements at the BS and the RS to compute (locally) the power, rate and subchannel allocations
of the BS and RS. Imperfect system state measurement has been taken into consideration to maintain
robust performance of the SU and the PU systems. Significant throughput gains have been observed in the
simulation results. We have also derived a simple (asymptotically optimal) control algorithm as well as
the closed-form performance for PFS for a sufficiently large number of users.

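Appendix A: Solution of Subproblem 1 and Subproblem 2
The gradient of $L_m$ in Subproblem 1 vanishes at the maximum, so we have

$$
\frac{\partial L_m}{\partial p_{m,n,k}} = 0 \;\Rightarrow\; p_{m,n,k} = \alpha_{m,n,k} \left( \frac{(1-\epsilon)\beta_{m,n} w_{m,k} - \mu_k}{\ln 2 \, \big(\nu + \eta_n \tau_{m,n} (1 - \beta_{m,n})\big)} - \frac{1}{\varphi_{m,n,k}} \right)^{+},
$$

$$
\frac{\partial L_m}{\partial \alpha_{m,n,k}} = 0 \;\Rightarrow\; X_{m,n,k} = \big((1-\epsilon)\beta_{m,n} w_{m,k} - \mu_k\big) \left[ \log_2\Big(1 + \frac{p_{m,n,k}\varphi_{m,n,k}}{\alpha_{m,n,k}}\Big) - \frac{p_{m,n,k}\varphi_{m,n,k}}{\ln 2 \, (\alpha_{m,n,k} + p_{m,n,k}\varphi_{m,n,k})} \right] = \lambda_n, \qquad (17)
$$

and $X_{m,n,k}$ can be interpreted as the marginal benefit of extra bandwidth. For a particular $\nu$,
if there is a unique $k^* = \arg\max_k \{X_{m,n,k}\}$ for some n, time-sharing will not happen in this
subchannel:

$$
\alpha_{m,n,k} =
\begin{cases}
1, & X_{m,n,k} = \max_k \{X_{m,n,k}\} > 0 \\
0, & \text{otherwise.}
\end{cases}
$$

Since for each given $\mu$, $X_{m,n,k}$ is a function of the CSI $\varphi_{m,n,k}$, the $X_{m,n,k}$ are
independent random variables. As a result, with probability 1 each subchannel is assigned to a single user.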
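We use the subgradient method to update the multipliers as follows:

$$
\begin{aligned}
\lambda_n(i+1) &= \Big[ \lambda_n(i) - \delta(i) \Big( 1 - \sum_{k=1}^{K_m} \alpha_{m,n,k} \Big) \Big]^{+} \quad \forall n \\
\nu(i+1) &= \Big[ \nu(i) - \delta(i) \Big( P_m - \sum_{n=1}^{N} \sum_{k=1}^{K_m} p_{m,n,k} \Big) \Big]^{+} \\
\eta_n(i+1) &= \Big[ \eta_n(i) - \delta(i) \Big( I - \sum_{k=1}^{K_m} (1 - \beta_{m,n}) \tau_{m,n} p_{m,n,k} \Big) \Big]^{+} \quad \forall n \\
\mu_k(i+1) &= \Big[ \mu_k(i) - \delta(i) \Big( R_{m,k} - \sum_{n=1}^{N} \alpha_{m,n,k} \log_2\Big(1 + \frac{p_{m,n,k}\varphi_{m,n,k}}{\alpha_{m,n,k}}\Big) \Big) \Big]^{+} \quad \forall k
\end{aligned}
$$

where $\delta(i)$ is a sequence of scalar step sizes and $[\cdot]^{+}$ denotes the projection onto the
feasible set, which contains all non-negative real numbers. The iterative algorithm terminates when the
difference between two consecutive multipliers is less than a terminating threshold. The subgradient
update is guaranteed to converge to the optimal multipliers $\lambda^*$, $\nu^*$, $\eta^*$, $\mu^*$.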
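The projected subgradient update can be illustrated on a much smaller problem than Subproblem 1. The sketch below is a toy example and not the paper's algorithm: it assumes a single user, one sum-power constraint with a single multiplier `nu`, no interference or rate constraints, and a diminishing step size 0.1/sqrt(i + 1), all of which are choices made only for this illustration. The inner maximization has the familiar water-filling form, and the multiplier is updated with the constraint slack and projected back onto the non-negative reals.

```python
import math


def waterfill(nu, weight, gains):
    """Inner maximizer of weight * log2(1 + p * g) - nu * p for each subchannel."""
    level = weight / (nu * math.log(2))
    return [max(level - 1.0 / g, 0.0) for g in gains]


def dual_subgradient(gains, p_total, weight=1.0, nu0=1.0, tol=1e-10, max_iter=5000):
    """Projected subgradient iteration on the power multiplier nu (toy example)."""
    nu = nu0
    for i in range(max_iter):
        powers = waterfill(nu, weight, gains)
        step = 0.1 / math.sqrt(i + 1)  # diminishing step size (assumed schedule)
        # Subgradient of the dual w.r.t. nu is the slack p_total - sum(powers).
        # The tiny positive floor keeps the water level finite (implementation choice).
        nu_next = max(nu - step * (p_total - sum(powers)), 1e-12)
        converged = abs(nu_next - nu) < tol
        nu = nu_next
        if converged:
            break
    return nu, waterfill(nu, weight, gains)


if __name__ == "__main__":
    nu_star, p_star = dual_subgradient([2.0, 1.0, 0.5, 0.1], p_total=4.0)
    print(nu_star, p_star, sum(p_star))
```

In this toy setting the weakest subchannel receives no power, and the remaining power budget is split so that the total matches the constraint.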
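We form the Lagrangian of Subproblem 2 as follows:

$$
\begin{aligned}
L_0 = {} & G_0 + \sum_{m=1}^{M} \sum_{n=1}^{N} (1-\epsilon)\beta_{0,n} G_m^*(r_{0,n,m}) - \sum_{n=1}^{N} \lambda_n \Big( \sum_{k=1}^{K_0} \alpha_{0,n,k} - 1 \Big) \\
& - \nu \Big( \sum_{n=1}^{N} \sum_{k=1}^{K_0} p_{0,n,k} - P_0 \Big) - \sum_{n=1}^{N} \eta_n \Big( \sum_{k=1}^{K_0} (1 - \beta_{0,n}) \tau_{0,n} p_{0,n,k} - I \Big). \qquad (18)
\end{aligned}
$$

Setting the gradient to zero gives

$$
\frac{\partial L_0}{\partial p_{0,n,k}} = 0 \;\Rightarrow\; p_{0,n,k} = \alpha_{0,n,k} \left( \frac{(1-\epsilon)\beta_{0,n} w_{0,k}}{\ln 2 \, \big[\nu + \eta_n \tau_{0,n} (1 - \beta_{0,n})\big]} - \frac{1}{\varphi_{0,n,k}} \right)^{+},
$$

$$
X_{0,n,k} = (1-\epsilon)\beta_{0,n} w_{0,k} \left( \log_2\Big(1 + \frac{p_{0,n,k}\varphi_{0,n,k}}{\alpha_{0,n,k}}\Big) - \frac{p_{0,n,k}\varphi_{0,n,k}}{\ln 2 \, (\alpha_{0,n,k} + p_{0,n,k}\varphi_{0,n,k})} \right),
$$

$$
\alpha_{0,n,k} =
\begin{cases}
1, & X_{0,n,k} = \max_k \{X_{0,n,k}\} > 0 \\
0, & \text{otherwise}
\end{cases}
$$

where the weight $w_{0,k}$ associated with the m-th RS (m = 1, ..., M) is the derivative of $G_m^*$ w.r.t.
$r_{0,n,m}$, which can be interpreted as the equivalent weight of the m-th RS. As a result, we can use a
subgradient update procedure similar to that of Subproblem 1 to obtain the multipliers $\lambda_n(i)$,
$\nu(i)$, $\eta_n(i)$. Furthermore, when the data rates for the relay stations are determined, the packet
partition factors $\{d_{m,n,k}\}$ can be determined according to the structure of $G_m^*(\cdot)$, i.e.
by selecting the packet partition factors which achieve the curve of $G_m^*(\cdot)$.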
Appendix B: Proof of Lemma 1

According to Appendix A, with probability 1 each subchannel is allocated to only one user or relay
station in phase one. Moreover, since the channel between the base station and each relay station is
sufficiently good, one subchannel is sufficient to carry the data for the phase-two transmission.
Therefore, with probability 1 each relay is allocated at most one subchannel. Thus, for any relay
station, there is at most one positive value in the set of rate allocations $\{r_{0,n,m} \,|\, \forall n\}$.
Noticing that $G_m^*(0) = 0$, we have

$$
G_m^*\Big( \sum_{n=1}^{N} T_{n,m} r_{0,n,m} \,\Big|\, S_m, H_m \Big) = \sum_{n=1}^{N} T_{n,m} \, G_m^*(r_{0,n,m} \,|\, S_m, H_m).
$$

Hence,

$$
\mathbb{E}_{T_m}\Big[ G_m^*\Big( \sum_{n=1}^{N} T_{n,m} r_{0,n,m} \,\Big|\, S_m, H_m \Big) \Big] = \sum_{n=1}^{N} (1-\epsilon)\beta_{0,n} G_m^*(r_{0,n,m} \,|\, S_m, H_m). \qquad (19)
$$

This completes the proof.

Appendix C: Proof of Lemma 2

Without loss of generality, we consider the m-th cluster. Suppose there are L QoS classes and denote
$w_l$ as the weight of the l-th QoS class. Since there is a sufficiently large number of users in each
cluster, the receive SNR of the selected users is sufficiently large, and therefore equal power
allocation is asymptotically optimal. Moreover, the relay station is only likely to pick the best users
(those with the largest channel gain) from each QoS class, and the channel fading of the best user tends
to a constant (e.g. ln K) when the number of users is sufficiently large. Hence, which subchannel is
allocated to which QoS class becomes independent of the channel fading, and the optimal resource
allocation is to do time-sharing among the L classes.

Let $g_{m,l}$ denote the maximum average weighted throughput of the m-th cluster if there are sufficient
information bits at the relay and only the users of the l-th QoS class are scheduled, and let
$\{r^{l}_{m,n,k}\}$ be the rate allocation leading to the maximum average weighted throughput $g_{m,l}$.
We define $r_{m,l} = \sum_{n,k} r^{l}_{m,n,k}$ as the corresponding total transmit data rate. These two
parameters can be evaluated by each relay locally. We first construct a function $\mathcal{G}_m(r)$
($\mathcal{G}_m : \mathbb{R} \to \mathbb{R}$) as follows (an example of $\mathcal{G}_m(r)$ is shown in
Figure 3):

• Plot the points $\{(r_{m,l}, g_{m,l}) \,|\, \forall l\}$ on a plane. These are the points B and C in Figure 3.

• Let $\mathcal{H}$ be the convex hull of the points $\{(r_{m,l}, g_{m,l}) \,|\, \forall l\}$ and (0, 0).
This refers to the triangle ABC in Figure 3.

• Define a region $\overline{\mathcal{H}} = \{(r, g) \,|\, \exists (r_e, g_e) \in \mathcal{H},\ r_e \le r,\ g \le g_e\}$.
This refers to the area bounded by the line ABCD and the x-axis in Figure 3. Therefore, for any given r,
all the average weighted throughputs in the set $\{g \,|\, (r, g) \in \overline{\mathcal{H}}\}$ are
achievable by cluster m using TDMA in each frame.

• $\mathcal{G}_m(r) = \max\{g \,|\, (r, g) \in \overline{\mathcal{H}}\}$. This refers to the line ABCD in Figure 3.

Therefore, $G_m^*(r) \ge \mathcal{G}_m(r)$. Moreover, since $\mathcal{G}_m(r) / G_m^*(r) \to 1$ for a
sufficiently large number of users in each QoS class, it is asymptotically optimal to have
$G_m^*(r) = \mathcal{G}_m(r)$.
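The construction above amounts to taking the upper boundary of the convex hull of the per-class points together with the origin, and then extending it horizontally to the right. The sketch below shows one way to compute and evaluate such an envelope; the function names and the example points are hypothetical, and it assumes every class point has a strictly positive rate.

```python
import numpy as np


def build_envelope(class_points):
    """Vertices of the upper hull of {(0, 0)} and the (rate, goodput) class points.

    Only vertices up to the highest goodput are kept, so the curve can be
    extended as a flat line beyond the last vertex (the segment CD in the figure).
    """
    pts = sorted(set([(0.0, 0.0)] + [(float(r), float(g)) for r, g in class_points]))
    hull = []
    for p in pts:
        # Pop the last vertex while it lies on or below the chord to the new point.
        while len(hull) >= 2:
            (ox, oy), (ax, ay) = hull[-2], hull[-1]
            cross = (ax - ox) * (p[1] - oy) - (ay - oy) * (p[0] - ox)
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    peak = max(range(len(hull)), key=lambda i: hull[i][1])
    return hull[:peak + 1]


def evaluate(envelope, r):
    """Piecewise-linear interpolation; np.interp holds the last value for large r."""
    xs, ys = zip(*envelope)
    return float(np.interp(r, xs, ys))


if __name__ == "__main__":
    env = build_envelope([(2.0, 4.0), (5.0, 6.0), (3.0, 4.2)])
    print(env)
    print([evaluate(env, r) for r in (1.0, 3.5, 10.0)])
```

In the example, the point (3.0, 4.2) lies below the chord between its neighbours and is dropped, which mirrors the fact that time-sharing between two classes can outperform a dominated class.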

Appendix D: Proof of Lemma 3

Without loss of generality, we consider the m-th cluster. Since the number of MSs in the m-th cluster
$K_m$ is sufficiently large, the sensing measure $\beta_{m,n} \to S_{m,n}$ and the system works in the
high SNR regime. Therefore, the throughput gain of power allocation across the subchannels is negligible
and we can simply assign equal power to each available subchannel, thus
$p_{m,n} = \frac{\beta_{m,n} P_0}{\sum_{n=1}^{N} \beta_{m,n}}$.

We first consider the case where
$r \ge R_m = \sum_{n=1}^{N} \log_2\big(1 + p_{m,n} l_{m,n,A_{m,n}} \varphi_{m,n,A_{m,n}}\big)$. In this
case, there are sufficient information bits at the relay for the phase-two transmission. Then the
selected MS of the n-th subchannel in the m-th cluster is given by
$A_{m,n} = \arg\max_k \, w_{m,k} \log_2\big(1 + p_{m,n} l_{m,n,k} \varphi_{m,n,k}\big)$, and

$$
G_m^*(r) = \sum_{n=1}^{N} \frac{(1-\epsilon)\beta_{m,n} w_{m,A_{m,n}}}{4} \log_2\big(1 + p_{m,n} l_{m,n,A_{m,n}} \varphi_{m,n,A_{m,n}}\big).
$$

For the case where $r < R_m$, it is easy to see by linear interpolation that

$$
G_m^*(r) \ge \sum_{n=1}^{N} \frac{(1-\epsilon)\beta_{m,n} w_{m,A_{m,n}} \, r}{4 R_m} \log_2\big(1 + p_{m,n} l_{m,n,A_{m,n}} \varphi_{m,n,A_{m,n}}\big).
$$

However, since the BS-RS link is sufficiently good, the BS always delivers $R_m$ bits to the m-th relay.
Hence, we can simply let $G_m^*(r)$ equal the right-hand side above, which does not affect the scheduling
results at the BS.

An average weighted throughput g is achievable when there is a joint power, rate and subchannel
allocation at cluster m such that the average weighted throughput is equal to g.
Appendix E: Proof of Theorem 1

Due to the page limitation, we provide a sketch of the proof. When the RS-BS link is sufficiently good
due to the existence of a line-of-sight path, the relay always receives sufficient information bits as
long as there is one available subchannel in cluster 0, and the PFS algorithm in each relay cluster
works as in single-cell systems with infinite backlog. Hence, we can follow a similar approach as in
[27] to prove that when $K_c$ is sufficiently large, the user selection is based on the small-scale
channel fading, which leads to (15).

Since there are N subchannels in the system, the probability that the BS cannot deliver packets to the
relays is $q_p^N$. Hence, in each cluster the probability that one subchannel is used to deliver packets
is $(1 - q_p)(1 - q_p^N)$. Again, by following a similar approach as in [27], $\overline{T}_{m,k}$ can be
derived after some algebra.
Fig. 1. Cluster-based relay-assisted downlink OFDMA system.

Fig. 2. A frame structure example for the relay-assisted OFDMA system.

Fig. 3. An example of the function $G_m^*(\cdot)$ where there are two QoS classes with weights $w_1$, $w_2$
and maximum achievable data rates $r_{m,1}$, $r_{m,2}$. The x-axis is the number of information bits the
RS decoded in phase one, and the y-axis is the maximum average weighted goodput achieved by this RS. When
the number of information bits is less than $r_{m,1}$, only the first QoS class is scheduled; when it is
larger than $r_{m,1}$ but less than $r_{m,2}$, both classes are scheduled by TDMA; and when it is larger
than $r_{m,2}$, only the second QoS class is scheduled.

Fig. 4. $\sum_k \log R_k$ and access probability of MSs in Cluster m (m = 1, ..., M) versus probability of
PU transmission $q_{p0}$. $q_f = 0.2$, $q_d = 0.8$, M=6, N=4, $K_0$=10, $K_m$=5, I=0 dB, receive
SNR = 10 dB, $\sigma_e^2 = 0.01$.

Fig. 5. Histogram of the average goodput of MSs (average data rate successfully received at the MSs) at
various distances from the BS. $q_{p0} = q_{pm} = 0.3$, $q_f = 0.2$, $q_d = 0.8$, M=6, N=4, $K_0$=10,
$K_m$=5, I=0 dB, receive SNR = 10 dB, $\sigma_e^2 = 0.01$.

Fig. 7. $\sum_k \log R_k$ and access probability of MSs in Cluster m (m = 1, ..., M) versus receive SNR.
$q_{p0} = 0.3$, $q_f = 0.2$, $q_d = 0.8$, M=6, N=4, $K_0$=10, $K_m$=5, I=0 dB, $\sigma_e^2 = 0.01$.

Fig. 8. $\sum_k \log R_k$ versus the number of MSs in a cell. $q_{p0} = 0.3$, $q_f = 0.2$, $q_d = 0.8$,
M=6, N=4, I=0 dB, receive SNR = 10 dB, $\sigma_e^2 = 0.01$.

Fig. 9. Average system goodput (average data rate successfully received at the MSs) versus CSIT quality.
$q_{p0} = q_{pm} = 0.3$, $q_f = 0.2$, $q_d = 0.8$, M=6, N=4, $K_0$=10, $K_m$=5, I=0 dB, receive SNR = 10 dB.